Overview

Dataset Statistics

Number of Variables 19
Number of Rows 7802
Missing Cells 0
Missing Cells (%) 0.0%
Duplicate Rows 1
Duplicate Rows (%) 0.0%
Total Size in Memory 8.4 MB
Average Row Size in Memory 1.1 KB
Variable Types
  • Categorical: 15
  • Numerical: 4

Dataset Insights

Novedad is skewed
Fecha has a high cardinality: 365 distinct values
Edad has a high cardinality: 96 distinct values
Calle has a high cardinality: 1161 distinct values
Departamento has constant value " MONTEVIDEO"
Fecha has constant length 10
Departamento has constant length 11
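
The insights above point to two cleanup candidates: values padded with whitespace (the constant value is " MONTEVIDEO", 11 characters including the leading space) and a column that never varies. The following is a minimal sketch of both checks, assuming the rows have already been loaded as dictionaries of strings. The helper names strip_values and constant_columns are hypothetical and are not part of the profiling tool.

```python
def strip_values(rows):
    """Return copies of the rows with surrounding whitespace removed from string values."""
    return [
        {key: value.strip() if isinstance(value, str) else value
         for key, value in row.items()}
        for row in rows
    ]


def constant_columns(rows):
    """Return the sorted names of columns holding a single distinct value."""
    if not rows:
        return []
    return sorted(
        key for key in rows[0]
        if len({row[key] for row in rows}) == 1
    )


# Illustrative rows only; the values mimic the padded strings shown in the report.
sample = [
    {"Departamento": " MONTEVIDEO", "Rol": " CONDUCTOR"},
    {"Departamento": " MONTEVIDEO", "Rol": " PASAJERO"},
]
cleaned = strip_values(sample)
print(cleaned[0]["Departamento"])   # MONTEVIDEO
print(constant_columns(cleaned))    # ['Departamento']
```

A constant column carries no information for modelling, so dropping it after this check is a common choice, though whether to do so depends on the downstream use.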

Variables


Fecha

categorical

Approximate Distinct Count 365
Approximate Unique (%) 4.7%
Missing 0
Missing (%) 0.0%
Memory Size 585150

Length

Mean 10
Standard Deviation 0
Median 10
Minimum 10
Maximum 10

Sample

1st row 01/01/2022
2nd row 01/01/2022
3rd row 01/01/2022
4th row 01/01/2022
5th row 01/01/2022

Letter

Count 0
Lowercase Letter 0
Space Separator 0
Uppercase Letter 0
Dash Punctuation 0
Decimal Number 62416
  • Fecha has words of constant length
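
Fecha is profiled as categorical even though every value is a 10-character date string. A minimal sketch of parsing it into a real date follows. It assumes day/month/year order, which the sample rows cannot confirm because they all read 01/01/2022. The helper name parse_fecha is hypothetical.

```python
from datetime import date, datetime


def parse_fecha(text: str) -> date:
    """Parse a 'dd/mm/yyyy' string (assumed field order) into a date."""
    return datetime.strptime(text.strip(), "%d/%m/%Y").date()


first = parse_fecha("01/01/2022")
print(first.isoformat())   # 2022-01-01
print(first.weekday())     # 5, i.e. Saturday (Monday is 0)
```

The computed weekday can be compared against the day-of-week variable reported further down as a consistency check on the assumed field order.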

Edad

categorical

Approximate Distinct Count 96
Approximate Unique (%) 1.2%
Missing 0
Missing (%) 0.0%
Memory Size 523545

Length

Mean 2.1039
Standard Deviation 1.0636
Median 2
Minimum 1
Maximum 10

Sample

1st row 37
2nd row 45
3rd row 69
4th row 31
5th row 19

Letter

Count 1080
Lowercase Letter 0
Space Separator 270
Uppercase Letter 1080
Dash Punctuation 0
Decimal Number 15065
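
The character counts above indicate that some Edad values are not numeric: 1080 uppercase letters and 270 space separators alongside 15065 digits. If those values are the " SIN DATOS" placeholder seen in other variables (an assumption, since the report does not show them), each one contributes 8 letters and 2 spaces, which implies 135 such rows and matches the maximum length of 10. A minimal sketch of that cross-check and of a tolerant conversion, with the hypothetical helper parse_edad:

```python
from typing import Optional

PLACEHOLDER = "SIN DATOS"  # assumed missing-value marker, as seen in other variables


def parse_edad(text: str) -> Optional[int]:
    """Convert an age string to int; the placeholder or any non-numeric text becomes None."""
    cleaned = text.strip()
    return int(cleaned) if cleaned.isdecimal() else None


# Cross-check of the report's character counts under the placeholder assumption.
letters, spaces, digits, n_rows = 1080, 270, 15065, 7802
rows_by_letters = letters // sum(ch.isalpha() for ch in PLACEHOLDER)   # 1080 / 8
rows_by_spaces = spaces // (" " + PLACEHOLDER).count(" ")              # 270 / 2
mean_length = (rows_by_letters * 10 + digits) / n_rows

print(parse_edad("37"), parse_edad(" SIN DATOS"))   # 37 None
print(rows_by_letters, rows_by_spaces)              # 135 135
print(round(mean_length, 4))                        # 2.1039
```

The reconstructed mean length agrees with the reported 2.1039, which supports (but does not prove) the placeholder assumption.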

Rol

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 580060
  • The largest value ( CONDUCTOR) is over 2.75 times larger than the second largest value ( PASAJERO)

Length

Mean 9.3476
Standard Deviation 1.0362
Median 10
Minimum 7
Maximum 10

Sample

1st row PASAJERO
2nd row CONDUCTOR
3rd row CONDUCTOR
4th row CONDUCTOR
5th row CONDUCTOR

Letter

Count 65128
Lowercase Letter 0
Space Separator 7802
Uppercase Letter 65128
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories ( CONDUCTOR, PASAJERO) take over 50.0%
  • The largest value (conductor) is over 2.75 times larger than the second largest value (pasajero)

Calle

categorical

Approximate Distinct Count 1161
Approximate Unique (%) 14.9%
Missing 0
Missing (%) 0.0%
Memory Size 692365

Length

Mean 19.4979
Standard Deviation 8.8087
Median 18
Minimum 3
Maximum 64

Sample

1st row CAMINO CARMELO CO...
2nd row CAMINO CARMELO CO...
3rd row BOULEVARD JOSE BA...
4th row SIN DATOS
5th row JOANICO

Letter

Count 129950
Lowercase Letter 290
Space Separator 21165
Uppercase Letter 129660
Dash Punctuation 92
Decimal Number 195
  • The largest value (avenida) is over 2.61 times larger than the second largest value (de)

Zona

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 592855
  • The largest value ( URBANA ) is over 5.19 times larger than the second largest value ( SUBURBANA )

Length

Mean 10.9876
Standard Deviation 0.1108
Median 11
Minimum 10
Maximum 11

Sample

1st row URBANA
2nd row URBANA
3rd row URBANA
4th row URBANA
5th row URBANA

Letter

Count 50609
Lowercase Letter 0
Space Separator 35116
Uppercase Letter 50609
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories ( URBANA , SUBURBANA ) take over 50.0%
  • The largest value (urbana) is over 5.19 times larger than the second largest value (suburbana)

Tipo de resultado

categorical

Approximate Distinct Count 4
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 602976
  • The largest value ( HERIDO LEVE) is over 7.76 times larger than the second largest value ( HERIDO GRAVE)

Length

Mean 12.2848
Standard Deviation 1.6736
Median 12
Minimum 12
Maximum 34

Sample

1st row HERIDO LEVE
2nd row HERIDO LEVE
3rd row HERIDO LEVE
4th row HERIDO LEVE
5th row HERIDO LEVE

Letter

Count 80017
Lowercase Letter 0
Space Separator 15829
Uppercase Letter 80017
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories ( HERIDO LEVE, HERIDO GRAVE) take over 50.0%

Tipo de siniestro

categorical

Approximate Distinct Count 6
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 1180144
  • The largest value ( COLISION ENTRE VEHÍCULOS) is over 6.03 times larger than the second largest value ( ATROPELLO DE PEATON)

Length

Mean 23.022
Standard Deviation 6.3353
Median 26
Minimum 7
Maximum 35

Sample

1st row COLISION ENTRE VE...
2nd row COLISION ENTRE VE...
3rd row CAÍDA
4th row COLISION ENTRE VE...
5th row COLISION ENTRE VE...

Letter

Count 151514
Lowercase Letter 0
Space Separator 21642
Uppercase Letter 151514
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories ( COLISION ENTRE VEHÍCULOS, ATROPELLO DE PEATON) take over 50.0%

Usa cinturón

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 587208
  • The largest value ( SIN DATOS) is over 21.75 times larger than the second largest value ( NO USA CINTURON)

Length

Mean 10.2638
Standard Deviation 1.2302
Median 10
Minimum 10
Maximum 16

Sample

1st row SIN DATOS
2nd row SIN DATOS
3rd row SIN DATOS
4th row SIN DATOS
5th row SIN DATOS

Letter

Count 64131
Lowercase Letter 0
Space Separator 15947
Uppercase Letter 64131
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories ( SIN DATOS, NO USA CINTURON) take over 50.0%

Usa casco

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 588648

Length

Mean 10.4483
Standard Deviation 1.0697
Median 10
Minimum 10
Maximum 13

Sample

1st row SIN DATOS
2nd row SIN DATOS
3rd row USA CASCO
4th row SIN DATOS
5th row NO USA CASCO

Letter

Count 64748
Lowercase Letter 0
Space Separator 16770
Uppercase Letter 64748
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories ( SIN DATOS, USA CASCO) take over 50.0%

Día de la semana

categorical

Approximate Distinct Count 7
Approximate Unique (%) 0.1%
Missing 0
Missing (%) 0.0%
Memory Size 618209

Length

Mean 7.763
Standard Deviation 1.2151
Median 8
Minimum 6
Maximum 10

Sample

1st row SÁBADO
2nd row SÁBADO
3rd row SÁBADO
4th row SÁBADO
5th row SÁBADO

Letter

Count 51617
Lowercase Letter 0
Space Separator 7802
Uppercase Letter 51617
Dash Punctuation 0
Decimal Number 0

Sexo

categorical

Approximate Distinct Count 3
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 582350
  • The largest value ( MASCULINO) is over 1.79 times larger than the second largest value ( FEMENINO)

Length

Mean 9.6411
Standard Deviation 0.4797
Median 10
Minimum 9
Maximum 10

Sample

1st row MASCULINO
2nd row MASCULINO
3rd row MASCULINO
4th row MASCULINO
5th row MASCULINO

Letter

Count 67415
Lowercase Letter 0
Space Separator 7805
Uppercase Letter 67415
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories ( MASCULINO, FEMENINO) take over 50.0%
  • The largest value (masculino) is over 1.79 times larger than the second largest value (femenino)

Hora

numerical

Approximate Distinct Count 24
Approximate Unique (%) 0.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 124832
Mean 13.8779
Minimum 0
Maximum 23
Zeros 161
Zeros (%) 2.1%
Negatives 0
Negatives (%) 0.0%
  • Hora is skewed left (γ1 = -0.5091)

Quantile Statistics

Minimum 0
5-th Percentile 4
Q1 10
Median 15
Q3 18
95-th Percentile 22
Maximum 23
Range 23
IQR 8

Descriptive Statistics

Mean 13.8779
Standard Deviation 5.483
Variance 30.0637
Sum 108275
Skewness -0.5091
Kurtosis -0.3503
Coefficient of Variation 0.3951
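
Several of the Hora figures are derived from the others, so they can be checked against each other. The sketch below recomputes them from the reported values. Small differences are expected because the report's inputs are already rounded (for example, the squared standard deviation gives 30.0633 against the reported variance of 30.0637).

```python
# Values copied from the Hora statistics in the report.
mean, std = 13.8779, 5.483
q1, q3 = 10, 18
minimum, maximum = 0, 23
n_rows = 7802

iqr = q3 - q1                      # interquartile range
value_range = maximum - minimum
variance = std ** 2                # variance is the squared standard deviation
cv = std / mean                    # coefficient of variation
total = mean * n_rows              # sum is the mean times the row count

print(iqr, value_range)            # 8 23
print(round(variance, 2))          # 30.06
print(round(cv, 4))                # 0.3951
print(round(total))                # 108275
```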

Departamento

categorical

Approximate Distinct Count 1
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 592952

Length

Mean 11
Standard Deviation 0
Median 11
Minimum 11
Maximum 11

Sample

1st row MONTEVIDEO
2nd row MONTEVIDEO
3rd row MONTEVIDEO
4th row MONTEVIDEO
5th row MONTEVIDEO

Letter

Count 78020
Lowercase Letter 0
Space Separator 7802
Uppercase Letter 78020
Dash Punctuation 0
Decimal Number 0
  • Departamento has words of constant length

Localidad

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 592593
  • The largest value ( MONTEVIDEO) is over 20.73 times larger than the second largest value ( SIN DATOS)

Length

Mean 10.954
Standard Deviation 0.2095
Median 11
Minimum 10
Maximum 11

Sample

1st row MONTEVIDEO
2nd row MONTEVIDEO
3rd row MONTEVIDEO
4th row SIN DATOS
5th row MONTEVIDEO

Letter

Count 77302
Lowercase Letter 0
Space Separator 8161
Uppercase Letter 77302
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories ( MONTEVIDEO, SIN DATOS) take over 50.0%
  • The largest value (montevideo) is over 20.73 times larger than the second largest value (datos)

Novedad

numerical

Approximate Distinct Count 6333
Approximate Unique (%) 81.2%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 124832
Mean 1.519e+07
Minimum 1.4067e+07
Maximum 1.6375e+07
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Novedad is skewed left (γ1 = -0.018)

Quantile Statistics

Minimum 1.4067e+07
5-th Percentile 1.4216e+07
Q1 1.4654e+07
Median 1.5184e+07
Q3 1.573e+07
95-th Percentile 1.6144e+07
Maximum 1.6375e+07
Range 2.3083e+06
IQR 1.0762e+06

Descriptive Statistics

Mean 1.519e+07
Standard Deviation 619957.3098
Variance 3.8435e+11
Sum 1.1851e+11
Skewness -0.01801
Kurtosis -1.1925
Coefficient of Variation 0.04081
  • Novedad is not normally distributed (p-value 1.053131074488801e-09)

Tipo de Vehiculo

categorical

Approximate Distinct Count 12
Approximate Unique (%) 0.2%
Missing 0
Missing (%) 0.0%
Memory Size 556700
  • The largest value ( MOTO) is over 2.18 times larger than the second largest value ( AUTO)

Length

Mean 6.3535
Standard Deviation 2.2226
Median 5
Minimum 5
Maximum 16

Sample

1st row AUTO
2nd row AUTO
3rd row MOTO
4th row AUTO
5th row MOTO

Letter

Count 40680
Lowercase Letter 0
Space Separator 8890
Uppercase Letter 40680
Dash Punctuation 0
Decimal Number 0
  • The top 2 categories ( MOTO, AUTO) take over 50.0%
  • The largest value (moto) is over 2.18 times larger than the second largest value (auto)

fixed

categorical

Approximate Distinct Count 2
Approximate Unique (%) 0.0%
Missing 0
Missing (%) 0.0%
Memory Size 571029
  • The largest value ( SIN DATOS) is over 3.97 times larger than the second largest value (1)

Length

Mean 8.1901
Standard Deviation 3.6077
Median 10
Minimum 1
Maximum 10

Sample

1st row 1
2nd row 1
3rd row 1
4th row 1
5th row 1

Letter

Count 49864
Lowercase Letter 0
Space Separator 12466
Uppercase Letter 49864
Dash Punctuation 0
Decimal Number 1569
  • The top 2 categories ( SIN DATOS, 1) take over 50.0%

X

numerical

Approximate Distinct Count 2131
Approximate Unique (%) 27.3%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 124832
Mean 575627.6442
Minimum 556200
Maximum 588320
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • X is skewed left (γ1 = -0.3494)

Quantile Statistics

Minimum 556200
5-th Percentile 567310
Q1 573126.25
Median 575970
Q3 578595
95-th Percentile 583575
Maximum 588320
Range 32120
IQR 5468.75

Descriptive Statistics

Mean 575627.6442
Standard Deviation 4739.0055
Variance 2.2458e+07
Sum 4.491e+09
Skewness -0.3494
Kurtosis 0.7423
Coefficient of Variation 0.008233
  • X is not normally distributed (p-value 0.005168890599737926)
  • X has 199 outliers
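
The report does not state which rule produced the outlier count. As an illustration only, the sketch below applies Tukey's 1.5 × IQR fences to the reported quartiles of X. This is an assumed rule, so it may not reproduce the figure of 199. The helper name tukey_fences is hypothetical.

```python
def tukey_fences(q1: float, q3: float, k: float = 1.5):
    """Return (lower, upper) bounds; values outside them are flagged as outliers."""
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


# Quartiles copied from the X statistics in the report.
lower, upper = tukey_fences(573126.25, 578595)
print(lower, upper)   # 564923.125 586798.125

# The reported minimum and maximum both fall outside these bounds.
print(556200 < lower, 588320 > upper)   # True True
```

Counting the rows outside these bounds in the raw data would show whether this rule matches the report's count.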

Y

numerical

Approximate Distinct Count 1945
Approximate Unique (%) 24.9%
Missing 0
Missing (%) 0.0%
Infinite 0
Infinite (%) 0.0%
Memory Size 124832
Mean 6.1418e+06
Minimum 6.1345e+06
Maximum 6.1574e+06
Zeros 0
Zeros (%) 0.0%
Negatives 0
Negatives (%) 0.0%
  • Y is skewed right (γ1 = 0.7806)

Quantile Statistics

Minimum 6.1345e+06
5-th Percentile 6.1368e+06
Q1 6.1389e+06
Median 6.141e+06
Q3 6.1441e+06
95-th Percentile 6.1491e+06
Maximum 6.1574e+06
Range 22925
IQR 5208.75

Descriptive Statistics

Mean 6.1418e+06
Standard Deviation 3831.7781
Variance 1.4683e+07
Sum 4.7918e+10
Skewness 0.7806
Kurtosis 0.2154
Coefficient of Variation 0.00062389
  • Y is not normally distributed (p-value 0.009577846472854808)
  • Y has 96 outliers
